Improved Matrix Multiplication by Changing Loop Order

Abstract

Matrix multiplication has been implemented in various programming languages, and improved performance has been reported in many articles under various settings. Matrix multiplication is of paramount interest to machine learning, a lightweight matrix-based key management protocol for IoT networks, animation, and so on. There is always a need for improvement in terms of algorithm and implementation. In this work, the authors compared the run times of matrix multiplication in popular languages such as C++, Java, and Python. This analysis showed that Python's implementation was poor, while Java's was relatively slower than the C++ implementation. All the aforementioned languages use a row-major scheme; hence, there are many cache misses encountered when the multiplication is implemented through simple looping. In contrast, the authors show that by changing the loop order, more performance gains are possible. Moreover, we evaluated the performance by comparing the execution time under various loop settings. The authors observed tremendous improvement in performance due to better spatial locality. In addition, the authors also implemented a parallel version of the same algorithm using OpenMP with eight logical cores and achieved a speed-up of seven times over the serial implementation.
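
The loop-order idea in the abstract can be illustrated with a minimal C++ sketch. The abstract does not say which loop order the authors settled on, so the i-k-j order below is an assumption chosen because it is a common cache-friendly ordering for row-major storage. The `Matrix` type and the function names `multiply_ijk` and `multiply_ikj` are hypothetical and not taken from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a square matrix stored in one contiguous
// row-major buffer, so element (r, c) lives at data[r * n + c].
struct Matrix {
    std::size_t n;
    std::vector<double> data;

    explicit Matrix(std::size_t size) : n(size), data(size * size, 0.0) {}

    double& at(std::size_t r, std::size_t c) { return data[r * n + c]; }
    double at(std::size_t r, std::size_t c) const { return data[r * n + c]; }
};

// Simple i-j-k order: the innermost loop walks down a column of b,
// so consecutive iterations touch addresses n doubles apart.
Matrix multiply_ijk(const Matrix& a, const Matrix& b) {
    assert(a.n == b.n);
    const std::size_t n = a.n;
    Matrix c(n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a.at(i, k) * b.at(k, j);
            }
            c.at(i, j) = sum;
        }
    }
    return c;
}

// Reordered i-k-j (an assumed ordering, see the note above): the innermost
// loop walks along row k of b and row i of c, both with unit stride.
Matrix multiply_ikj(const Matrix& a, const Matrix& b) {
    assert(a.n == b.n);
    const std::size_t n = a.n;
    Matrix c(n);  // starts at zero, so partial products can be accumulated
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const double a_ik = a.at(i, k);  // reused across the whole inner loop
            for (std::size_t j = 0; j < n; ++j) {
                c.at(i, j) += a_ik * b.at(k, j);
            }
        }
    }
    return c;
}
```

Both functions compute the same product; only the memory access pattern differs. In the reordered version each iteration of the outer `i` loop writes a separate row of `c`, which makes that loop a natural candidate for an OpenMP `#pragma omp parallel for`. Whether this matches the authors' parallel version is not stated in the abstract.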

Similar resources

Graph Parsing by Matrix Multiplication

The graph data model is widely used in many areas, for example, bioinformatics, graph databases, and RDF. Navigational queries are among the most common graph queries. The result of query evaluation is a set of implicit relations between nodes of the graph, i.e. paths in the graph. A natural way to specify these relations is by specifying paths using formal grammars over edge labels. The answer to the context...

An Improved Combinatorial Algorithm for Boolean Matrix Multiplication

We present a new combinatorial algorithm for triangle finding and Boolean matrix multiplication that runs in Ô(n^3 / log^4 n) time, where the Ô notation suppresses poly(loglog) factors. This improves the previous best combinatorial algorithm by Chan [4] that runs in Ô(n^3 / log^3 n) time. Our algorithm generalizes the divide-and-conquer strategy of Chan's algorithm. Moreover, we propose a general framewo...

Improved output-sensitive quantum algorithms for Boolean matrix multiplication

We present new quantum algorithms for Boolean Matrix Multiplication in both the time complexity and the query complexity settings. As far as time complexity is concerned, our results show that the product of two n × n Boolean matrices can be computed on a quantum computer in time Õ(n^(3/2) + n l^(3/4)), where l is the number of non-zero entries in the product, improving over the output-sensitive quantum a...

Strassen's Matrix Multiplication Algorithm for Matrices of Arbitrary Order

The well-known algorithm of Volker Strassen for matrix multiplication can only be used for (m2^k × m2^k) matrices. For arbitrary (n × n) matrices one has to add zero rows and columns to the given matrices to use Strassen's algorithm. Strassen gave a strategy of how to set m and k for arbitrary n to ensure n ≤ m2^k. In this paper we study the number d of additional zero rows and columns and the influen...

GPU-Accelerated Sparse Matrix-Matrix Multiplication by Iterative Row Merging

We present an algorithm for general sparse matrix-matrix multiplication (SpGEMM) on many-core architectures, such as GPUs. SpGEMM is implemented by iterative row merging, similar to merge sort, except that elements with duplicate column indices are aggregated on the fly. The main kernel merges small numbers of sparse rows at once using sub-warps of threads to realize an early compression effect...

Journal

Journal title: Mobile Information Systems

Year: 2022

ISSN: 1875-905X, 1574-017X

DOI: https://doi.org/10.1155/2022/9650652